34 research outputs found

    Final FLaReNet deliverable: Language Resources for the Future - The Future of Language Resources

    Language Technologies (LT), together with their backbone, Language Resources (LR), provide essential support to the challenge of Multilingualism and ICT of the future. The main task of language technologies is to bridge language barriers and to help create a new environment where information flows smoothly across frontiers and languages, regardless of the country and language of origin. To achieve this goal, all players involved need to act as a community able to join forces on a set of shared priorities. Until now, however, the field of Language Resources and Technology has suffered from an excess of individuality and fragmentation, with a lack of coherence concerning the priorities for the field and the direction to move in, not to mention a common timeframe. The context encountered by the FLaReNet project was thus an active field needing a coherence that can only be achieved by sharing common priorities and endeavours. FLaReNet has contributed to the creation of this coherence by gathering a wide community of experts and having them participate in the definition of an exhaustive set of recommendations.

    Metadata Curation Strategy


    Eenwoordsconstituenten in GrETEL


    Towards Semi-Automatic Analysis of Spontaneous Language for Dutch

    This paper presents results of an application (Sasta) derived from the CLARIN-developed tool GrETEL for the automatic assessment of transcripts of spontaneous Dutch language. The techniques described here, if successful, (1) have important societal impact, (2) are interesting from a scientific point of view, and (3) may benefit the CLARIN infrastructure itself, since they enable a derivative program that can improve the quality of the annotations of Dutch data in CHAT format.

    Annotation in AnnCor


    Discovering Resources in CLARIN: Problems and Suggestions for Solutions

    This paper describes a range of problems for discovering (mainly linguistically interesting) data via the CLARIN Virtual Language Observatory. It analyzes the causes of these problems and makes a range of suggestions on how the situation can be improved.

    De Verleidingen en Gevaren van GrETEL

    Corpora are a useful and important source of evidence for linguistic research, but they are not the only kind of evidence, do not have any special status as evidence, and have their limitations. Recent very user-friendly applications such as GrETEL make it very easy to search in large and richly annotated corpora on the basis of an example sentence and without knowledge of a query language or the exact nature of the linguistic annotations. It is therefore very tempting to use these applications intensively. That is fine, but also dangerous in some ways, because in many cases, in order to interpret the results correctly, the researcher must be aware of the precise nature of the linguistic annotations and of the way in which the user-friendly interface generates a query on the basis of an example sentence. I will illustrate this with several examples. I also sketch some methods for avoiding or mitigating the dangers and argue that the applications should also support these methods in as user-friendly a manner as possible.